AN EXPLORATORY STUDY OF THE APPLICATION OF GENERALIZED INVERSE
TO ILS ESTIMATION OF OVERIDENTIFIED EQUATIONS IN LINEAR MODELS


J.  Daniel  Khazzoom 


Abstract 

In  this  paper,  we  propose  a procedure  based  on  the  use  of  the  Moore- 
Penrose  inverse  of  matrices  for  deriving  unique  Indirect  Least  Squares 
(ILS)  estimates  of  the  structural  parameters  in  the  overidentified  case. 
The  procedure  makes  use  of  all  reduced  form  estimates  in  deriving  the 
unique  structural  estimates.  The  estimator  is  shown  to  be  consistent. 

We derive the relationship between this estimator and the Two-Stage Least
Squares (2SLS) and Instrumental-Variables (I.V.) estimators. We also derive
the asymptotic distribution of the proposed estimator. The results of
sampling experiments are summarized.




1.  ILS (Indirect Least Squares) Set-up

Let the operation of an economic system be characterized by

(1.1)  $Y = YB + XC + U$,

where the dimensions are
$(T \times m) = (T \times m)(m \times m) + (T \times G)(G \times m) + (T \times m)$.

$Y$, $X$, and $U$ are matrices of endogenous, predetermined and random
variables, respectively; $B$ and $C$ are parameter matrices of $\beta$'s and
$\gamma$'s, respectively. (Throughout, we follow essentially the notations
synthesized by Dhrymes [3, pp. 172-200, 279-365] and Rao and Mitra
[8, pp. 12-17].)

The dependent variable $y_{.1}$ is explained by $m_1 < m$ current endogenous
variables and $G_1 \le G$ predetermined variables. We make the usual
assumptions on the random matrix $U$. The reduced form of (1.1) is defined as

(1.2)  $Y = XC(I-B)^{-1} + U(I-B)^{-1} = X\Pi + V$.


By appropriately partitioning $\hat\Pi$ (^ indicates estimates),
postmultiplying $\hat\Pi$ by the $m \times 1$ column
$[1, -\hat\beta'_{.1}, 0]'$ and rearranging terms, we have the usual
recursive system for the estimated parameters of the first equation in (1.2):
It is well known that (1.3) has a unique solution in the just-identified case
(that is, when $\hat\Pi_{G^* m_1}$ is non-singular). In the overidentified
case ($G^* > m_1$, and $\hat\Pi_{G^* m_1}$ has full column rank) there is more
than one way (although a finite number of ways) for consistently estimating
the structural parameters.

Because of the difficulty in choosing among these alternative ways, ILS
has fallen into practical disuse. Other limited information estimators
which were developed in the meantime compromise in various ways between
the estimates in the overidentified case. The procedure I propose in this
paper is a compromise in the same nature as the 2SLS. We use the
Moore-Penrose (MP) generalized inverse to derive unique ILS estimates of the
structural parameters in the overidentified case. For a discussion of the
MP inverse, see [8, pp. 50-55]. Briefly, if $D$ is an $m \times n$ matrix, its
MP inverse, denoted by $D^+$, is an $n \times m$ matrix which satisfies the
following four conditions: i) $DD^+D = D$; ii) $D^+DD^+ = D^+$;
iii) $(DD^+)' = DD^+$; iv) $(D^+D)' = D^+D$, where $'$ denotes conjugate
transpose and where the inner product is defined with respect to the identity
matrix. $D^+$ is unique and has the same rank as $D$. A matrix that satisfies
(i) only is called a g-inverse and usually denoted by $D^-$.¹
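The four conditions can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using its `numpy.linalg.pinv` routine (not the algorithm used in the paper); the matrix `D` and the perturbation `W` are arbitrary illustrative draws.

```python
import numpy as np

def penrose_conditions(D, D_plus, tol=1e-10):
    """Return booleans for conditions (i)-(iv) for a candidate inverse."""
    return (
        np.allclose(D @ D_plus @ D, D, atol=tol),                   # i)
        np.allclose(D_plus @ D @ D_plus, D_plus, atol=tol),         # ii)
        np.allclose((D @ D_plus).conj().T, D @ D_plus, atol=tol),   # iii)
        np.allclose((D_plus @ D).conj().T, D_plus @ D, atol=tol),   # iv)
    )

rng = np.random.default_rng(0)
D = rng.normal(size=(6, 4))        # illustrative 6 x 4 matrix, full column rank
D_plus = np.linalg.pinv(D)         # Moore-Penrose inverse, shape 4 x 6
mp_checks = penrose_conditions(D, D_plus)
same_rank = np.linalg.matrix_rank(D) == np.linalg.matrix_rank(D_plus)

# A g-inverse only has to satisfy (i). Adding W (I - D D+) to D+ preserves
# D G D = D because (I - D D+) D = 0, but D G is in general not symmetric,
# so condition (iii) fails.
W = rng.normal(size=(4, 6))
G_inv = D_plus + W @ (np.eye(6) - D @ D_plus)
g_checks = penrose_conditions(D, G_inv)
```

Here `mp_checks` holds all four conditions for the MP inverse, while `g_checks` shows a g-inverse that satisfies (i) but not (iii).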


2.  ILS Estimates Using Moore-Penrose Inverse

Denote equation (1.3) as

(2.1)  $\hat D\hat\delta_{.1} = \hat\pi_{.1}$.

If we use the MP inverse to solve (2.1), we get, since $\hat D$ has full
column rank so that $\hat D^+ = (\hat D'\hat D)^{-1}\hat D'$,

(2.2)  $\hat\delta_{.1} = (\hat D'\hat D)^{-1}\hat D'\hat\pi_{.1}$.

The vector $\hat\delta_{.1}$ is unique and has the property that it is the
minimum (Euclidean) norm least squares solution of (2.1). Using the fact that
$\hat D^+ = (\hat D'\hat D)^{-1}\hat D'$, it follows that

$\hat D^+ = \begin{bmatrix} 0 & \hat\Pi^+_{G^* m_1} \\ I & -\hat\Pi_{G_1 m_1}\hat\Pi^+_{G^* m_1} \end{bmatrix}$,
which shows that $\hat\delta_{.1}$ in (2.2) coincides with the recursive
solution of (1.3), when the MP inverse is used to solve for $\hat\beta_{.1}$
and then $\hat\gamma_{.1}$.

It can be shown that $\hat\delta_{.1}$ is a consistent estimator of
$\delta_{.1}$. To see this, note that the elements of $\hat D$ are consistent
estimates of the corresponding elements in the true $D$. Since $D'D$ is
non-singular, it follows that

(2.3)  $\mathrm{plim}\,\hat\delta_{.1} = (D'D)^{-1}D'\pi_{.1} = D^+\pi_{.1} = \delta_{.1}$.

The last equality follows from the fact that in the population the system
$D\delta_{.1} = \pi_{.1}$ is known to be a consistent set of equations.
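The equivalence between (2.2) and the recursive solution can be illustrated numerically. This is a minimal sketch, assuming NumPy and assuming that $\hat D$ stacks the reduced-form block for the included endogenous variables next to an identity/zero selector for the included predetermined variables, with $\hat\delta_{.1}$ ordered as $(\hat\beta', \hat\gamma')'$; the sizes and the random stand-ins for the reduced-form estimates are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
G, G1, m1 = 6, 2, 2                   # hypothetical sizes; G* = G - G1 = 4 > m1
Pi1_hat = rng.normal(size=(G, m1))    # stand-in for the reduced-form block
pi_hat = rng.normal(size=G)           # stand-in for the first reduced-form column

# Assumed layout of D: [reduced-form block, (I; 0)], beta columns first.
selector = np.vstack([np.eye(G1), np.zeros((G - G1, G1))])
D_hat = np.hstack([Pi1_hat, selector])

# (2.2) computed two ways: via the MP inverse and via the normal equations.
delta_mp = np.linalg.pinv(D_hat) @ pi_hat
delta_ne = np.linalg.solve(D_hat.T @ D_hat, D_hat.T @ pi_hat)

# Recursive route: beta from the last G* rows using the MP inverse,
# then gamma from the first G1 rows.
beta_rec = np.linalg.pinv(Pi1_hat[G1:]) @ pi_hat[G1:]
gamma_rec = pi_hat[:G1] - Pi1_hat[:G1] @ beta_rec
delta_rec = np.concatenate([beta_rec, gamma_rec])

# With full column rank, D+ D is the identity.
left_identity = np.linalg.pinv(D_hat) @ D_hat
```

All three vectors agree up to rounding, and `left_identity` is numerically the identity matrix of order $m_1 + G_1$.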


3.  Relation of ILS to 2SLS and I.V. Estimator; Asymptotic Distribution

Writing in full the first equation in (1.1), we have

(3.1)  $y_{.1} = Y_1\beta_{.1} + X_1\gamma_{.1} + u_{.1} = X\hat\Pi_1\beta_{.1} + X_1\gamma_{.1} + u_{.1} + \hat V_1\beta_{.1}$,

where $\hat\Pi_1$ and $\hat V_1$ consist of the 2nd, ..., $(m_1+1)$st columns
of $\hat\Pi$ and $\hat V$, respectively. Noting


(3.2)  $(X\hat\Pi_1, X_1) = X\hat D$,

the 2SLS solution of (3.1) is easily seen to be

(3.3)  $\tilde\delta_{.1} = \{(X\hat D)'(X\hat D)\}^{-1}(X\hat D)'y_{.1} = \{\hat D'(X'X)\hat D\}^{-1}\hat D'(X'X)\hat\pi_{.1}$,

where we made use of the fact that $y_{.1} = X\hat\pi_{.1} + \hat v_{.1}$ and
$X'\hat v_{.1} = 0$.

Observe (3.3) is the minimum norm least squares solution of

(3.4)  $X\hat D\delta_{.1} = X\hat\pi_{.1}$.

By comparing (2.2) with (3.3) it is evident how the 2SLS and ILS compromise
between the various estimates in the overidentified case.
Both procedures solve for the minimum norm least squares estimator. The
difference is in the definition of the norm. In the 2SLS case, the
(quadratic) norm in (3.3) is defined with respect to the moment matrix
of all the predetermined variables. In the ILS case, the norm is defined
with respect to the identity matrix. The two estimators coincide when
$X'X$ is a scalar matrix, and similarly when the equation is just identified
($\hat D$ is then square and non-singular). In light of (2.2) and (3.3),
it is straightforward to infer the asymptotic distribution of the
proposed ILS estimator from the asymptotic distribution of the 2SLS
estimator. For 2SLS, we have (see Dhrymes [3, pp. 190-192])


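(3.5)  $\sqrt{T}\,(\tilde\delta_{.1} - \delta_{.1}) \sim N(0, \sigma_{11}\,\mathrm{plim}\,\Psi_1)$,

where

(3.6)  $\Psi_1 = \left\{\left[\frac{Z_1'X}{T}\right]\left[\frac{X'X}{T}\right]^{-1}\left[\frac{X'Z_1}{T}\right]\right\}^{-1} = [\hat D^+]_{(X'X)}\left[\frac{X'X}{T}\right]^{-1}[\hat D^{+\prime}]_{(X'X)}$,

(3.7)  $Z_1 = (Y_1 \;\; X_1)$.

The subscript indicates the matrix with respect to which the
inner product is defined. For ILS, we have

(3.8)  $\sqrt{T}\,(\hat\delta_{.1} - \delta_{.1}) \sim N(0, \sigma_{11}\,\mathrm{plim}\,\Psi_2)$,

where

(3.9)  $\Psi_2 = [\hat D^+]_{I}\left[\frac{X'X}{T}\right]^{-1}[\hat D^{+\prime}]_{I}$.

In order to arrive at $\Psi_2$ we simply changed the norm in (3.6) so that
the inner product of $D^+$ is now defined with respect to the identity matrix
rather than $(X'X)$. Equation (3.9) can also be derived directly.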
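One way the direct derivation can be sketched is given below. It assumes (beyond what the text states explicitly) that $\hat D$ has full column rank, so that $\hat D^+\hat D = I$, and that $X'u_{.1}/\sqrt{T}$ is asymptotically $N(0, \sigma_{11}\,\mathrm{plim}\,X'X/T)$. It uses $\hat\pi_{.1} = (X'X)^{-1}X'y_{.1}$, $y_{.1} = Z_1\delta_{.1} + u_{.1}$ with $Z_1 = (Y_1 \;\; X_1)$, and $\hat D = (X'X)^{-1}X'Z_1$.

```latex
\begin{align*}
\hat\pi_{.1} &= (X'X)^{-1}X'(Z_1\delta_{.1} + u_{.1})
              = \hat D\,\delta_{.1} + (X'X)^{-1}X'u_{.1}, \\
\hat\delta_{.1} &= \hat D^{+}\hat\pi_{.1}
              = \delta_{.1} + \hat D^{+}(X'X)^{-1}X'u_{.1}, \\
\sqrt{T}\,(\hat\delta_{.1} - \delta_{.1})
             &= \hat D^{+}\left[\frac{X'X}{T}\right]^{-1}\frac{X'u_{.1}}{\sqrt{T}}, \\
\text{asymptotic covariance}
             &= \sigma_{11}\,\mathrm{plim}\;
                \hat D^{+}\left[\frac{X'X}{T}\right]^{-1}\hat D^{+\prime}.
\end{align*}
```

The last line follows because the middle factor $[X'X/T]^{-1}\,(\sigma_{11}X'X/T)\,[X'X/T]^{-1}$ collapses to $\sigma_{11}[X'X/T]^{-1}$.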
It is also useful to look at the results derived so far from the
point of view of I.V. estimation. If we choose the instrumental variables

(3.10)  $P = X(X'X)^{-1}X'Z_1$,   $Q = X(X'X)^{-1}$,

and rewrite (3.6) and (3.9) as

(3.11)  $\Psi_1 = \left[\frac{P'Z_1}{T}\right]^{-1}\left[\frac{P'P}{T}\right]\left[\frac{Z_1'P}{T}\right]^{-1}$,

(3.12)  $\Psi_2 = \left[\frac{Q'Z_1}{T}\right]^{+}\left[\frac{Q'Q}{T}\right]\left[\frac{Z_1'Q}{T}\right]^{+}$,

we see that the righthand side of (3.11) and (3.12) has the standard form
of the covariance matrix whose probability limit appears in the asymptotic
distribution of I.V. estimators (except that in (3.12) we have the MP inverse
instead of the conventional inverse). This is not a surprising result,
since it is well known that 2SLS and ILS have an I.V. interpretation,
with instruments $P$ and $Q$, respectively, as in (3.10). Where (3.12)
departs from conventional results, however, is in the number of instrumental
variables. $Q$ has $G$ columns where $G > m_1 + G_1$, whereas it is
standard to require the number of instrumental variables to be the same
as the number of explanatory variables in the equation (otherwise the
matrix to be inverted will not be square in the first place). Dhrymes
[3, p. 365], for example, points out that when $G > m_1 + G_1$, the ILS will
fail to yield unique estimators because of what may be interpreted as the
attempt to use "too many" instrumental variables in estimating the
structural parameters. The results we derived in this section indicate that
"too many" instruments is not really a hindrance for deriving unique I.V.
estimates, if we are willing to work with the MP inverse. (Note
that if instead of $P$ as defined in (3.10), we choose $P = X(X'X)^{-1}X'$
and substitute in (3.14) below for this alternative choice of $P$, we
would get (3.4), which we already know yields the 2SLS estimator when the MP
inverse is used to solve it.)

In an asymptotic efficiency sense, the 2SLS dominates the class of I.V.
estimators whose instruments belong to the subspace spanned by the
predetermined variables of the system. For the two covariance matrices (3.11)
and (3.12), the relative efficiency (deleting the division of T),

(3.13)  $\Psi_2 - \Psi_1 = \hat D^+Q'[I - P(P'P)^{-1}P']Q\hat D^{+\prime}$,

is a p.s.d. matrix, since $I - P(P'P)^{-1}P'$ is symmetric idempotent. To
summarize: the ILS estimator proposed in this paper, as well as the 2SLS
estimator in the overidentified case, achieves a compromise among the
various estimates in the overidentified case by finding the (unique) minimum
norm least squares solution for $\delta_{.1}$ in

(3.14)  $P'y_{.1} = P'Z_1\delta_{.1}$,

(3.15)  $Q'y_{.1} = Q'Z_1\delta_{.1}$,

where $P$ and $Q$ are defined in (3.10). The solution of (3.14) yields
the 2SLS estimator and the solution of (3.15) yields the ILS estimator.
In both cases, the solutions have the same structure; they differ in the
definition of the norm.
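The two I.V. systems can be compared on simulated data. The sketch below assumes NumPy; the data-generating process, the sizes, and the helper names `simulate_equation`, `ils_and_2sls` and `efficiency_gap` are hypothetical and serve only to illustrate the algebra, with $P = X(X'X)^{-1}X'Z_1$ and $Q = X(X'X)^{-1}$ as instruments and $X_1$ taken to be the first $G_1$ columns of $X$.

```python
import numpy as np

def simulate_equation(T, G, G1, m1, rng):
    """Draw one hypothetical sample; all parameter values are illustrative."""
    X = rng.normal(size=(T, G)) * np.arange(1, G + 1)   # unequal column scales
    V = rng.normal(size=(T, m1 + 1))
    Y1 = X @ rng.normal(size=(G, m1)) + V[:, 1:]        # endogenous regressors
    u = V[:, 0] + 0.5 * V[:, 1]                         # correlated with Y1
    y = Y1 @ np.full(m1, 0.5) + X[:, :G1] @ np.ones(G1) + u
    return y, Y1, X

def ils_and_2sls(y, Y1, X, G1):
    """Return (ILS, 2SLS) estimates of (beta', gamma')' via the MP inverse."""
    Z1 = np.hstack([Y1, X[:, :G1]])
    XtX_inv = np.linalg.inv(X.T @ X)
    P = X @ XtX_inv @ X.T @ Z1                        # 2SLS instruments
    Q = X @ XtX_inv                                   # ILS instruments
    d_2sls = np.linalg.pinv(P.T @ Z1) @ (P.T @ y)     # P'y = P'Z1 delta
    d_ils = np.linalg.pinv(Q.T @ Z1) @ (Q.T @ y)      # Q'y = Q'Z1 delta
    return d_ils, d_2sls

def efficiency_gap(Y1, X, G1):
    """Covariance of ILS minus covariance of 2SLS, without the division by T."""
    Z1 = np.hstack([Y1, X[:, :G1]])
    XtX = X.T @ X
    D = np.linalg.solve(XtX, X.T @ Z1)                # D = Q'Z1
    D_plus = np.linalg.pinv(D)
    psi_ils = D_plus @ np.linalg.inv(XtX) @ D_plus.T
    psi_2sls = np.linalg.inv(D.T @ XtX @ D)
    return psi_ils - psi_2sls

rng = np.random.default_rng(2)

# Overidentified case: G = 6 > m1 + G1 = 4.
y_o, Y1_o, X_o = simulate_equation(T=200, G=6, G1=2, m1=2, rng=rng)
ils_over, tsls_over = ils_and_2sls(y_o, Y1_o, X_o, G1=2)
gap = efficiency_gap(Y1_o, X_o, G1=2)
min_eig = np.linalg.eigvalsh((gap + gap.T) / 2).min()

# Just-identified case: G = m1 + G1 = 4, so D is square.
y_j, Y1_j, X_j = simulate_equation(T=200, G=4, G1=2, m1=2, rng=rng)
ils_just, tsls_just = ils_and_2sls(y_j, Y1_j, X_j, G1=2)
```

In the just-identified draw the two estimates coincide up to rounding, while in the overidentified draw they differ and the smallest eigenvalue of the covariance gap is non-negative up to rounding, in line with the p.s.d. property.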
4.  Design of the Monte Carlo Experiments

The objective of the experiments is (1) to gather evidence on the
relative performance of ILS vs. 2SLS and the Limited Information Maximum
Likelihood (LIML) estimator, the two most commonly used single-equation
consistent estimators; (2) to test the hypothesis that the bias for the
ILS estimator does not depend on (a) the size of the covariance matrix
$\Sigma = (\sigma_{ij})$ of the vector $(u_{1t}, u_{2t}, \ldots, u_{mt})'$,
(b) the sparseness of $\Sigma$, and (c) the sample size; (3) to test the
hypotheses that the relative performance of ILS does not depend on the
factors listed in (a) to (c).
For the purpose of this paper, I chose a structure from one of the
experiments reported by Cragg [2]. (Initially, I estimated several runs
using the structure estimated by Wagner [11], but because of the special
nature of the structure used by Wagner (damped difference equations
dominated by a trend factor) I did not think the results would be of
general interest.) The rationale for using this particular structure was
to permit a comparison of ILS with 2SLS and LIML in a model for which the
last two estimators are known to have performed very well. The structure
is the following:


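$y_{1t} = .89\,y_{2t} + .16\,y_{3t} + 44.00\,x_{1t} + .74\,x_{2t} + .13\,x_{3t} + u_{1t}$

$y_{2t} = .74\,y_{1t} + 62.00\,x_{1t} + .96\,x_{[?]t} + .70\,x_{[?]t} + .06\,x_{7t} + u_{2t}$

$y_{3t} = .29\,y_{[?]t} + 40.00\,x_{1t} + .11\,x_{[?]t} + .53\,x_{5t} + .56\,x_{6t} + u_{3t}$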

where $x_1$ is a vector of 1's. In conjunction with $\Sigma_1$ (see below),
this is structure 8 reported in Cragg [2, p. 92]. In all experiments, we
estimated the parameters of the first equation only, using ILS, 2SLS and
LIML. An experiment consisted of generating 100 samples of size T, with
T = 60, 40, or 20 observations. The predetermined variables are truly
exogenous and, except for the vector of constants $x_1$, are uniformly and
independently distributed random numbers with values in [-100, 100]. The
values of the exogenous variables were fixed for repeated samples of the same
size. The sample correlation matrices for the exogenous variables used in
the experiments are the following:


T=60
        x2     x3     x4     x5     x6
x3    [illegible]
x4     .10   -.05
x5    [illegible]
x6     .02    .03    .01    .02
x7    -.03   -.13   -.10    .05   -.13

T=40
        x2     x3     x4     x5     x6
x3    -.16
x4    -.04   -.33
x5     .45   -.08   -.16
x6     .29    .02    .15    .19
x7     .00   -.04    .19   -.07   -.02

T=20
        x2     x3     x4     x5     x6
x3    [illegible]
x4    -.02   -.24
x5     .51    .33   -.19
x6     .09    .07   -.34    .15
x7    -.01   -.19    .18   -.43   -.43


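The structural disturbances were generated from mutually independent and
normally distributed (3-dimensional) vectors with zero mean and the
following $\Sigma$'s (lower triangles shown):

$\Sigma_1 = \begin{bmatrix} 38.60 & & \\ -5.92 & 36.68 & \\ -14.80 & -2.98 & 40.64 \end{bmatrix}$,
$\Sigma_2 = \begin{bmatrix} 38.60 & & \\ 0 & 36.68 & \\ 0 & 0 & 40.64 \end{bmatrix}$,

$\Sigma_3 = \begin{bmatrix} 386.0 & & \\ -59.2 & 366.8 & \\ -148.0 & -29.8 & 406.4 \end{bmatrix}$,
$\Sigma_4 = \begin{bmatrix} 386.0 & & \\ 0 & 366.8 & \\ 0 & 0 & 406.4 \end{bmatrix}$.
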

In conjunction with each one of these $\Sigma$'s, we carried out three
experiments with T=60, 40, 20, for a total of 12 experiments. Several
algorithms are available in the literature for computing the MP inverse. I
used Johnson and Chou's algorithm [5]. As a check, I calculated
$\hat D^+\hat D$ for several of the $\hat D$ we computed. The results were
identical to the identity matrix for at least the first six decimals.

The following measures of the relative performance were calculated
for the estimated parameters: (1) arithmetic mean,² (2) median, (3) standard
deviation, (4) root mean square error,² (5) number of the estimates within
± 10% of the true parameter, and (6) maximum absolute deviation of the
estimates from the parameters. The merits and limitations of several of
these measures have been discussed by several authors. See, for example,
Summers [9, pp. 12-13]; Quandt [7, pp. 96-97]; and Christ [1, pp. 475-476].
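The six measures can be computed for one parameter across replications along the following lines. This is a minimal sketch using only the Python standard library; the function name `summary_measures`, the use of the population standard deviation, and the five sample estimates are assumptions made for illustration and are not taken from the experiments.

```python
import math
import statistics

def summary_measures(estimates, true_value):
    """Six summary measures for one parameter over repeated samples."""
    estimates = list(estimates)
    deviations = [e - true_value for e in estimates]
    return {
        "mean": statistics.fmean(estimates),
        "median": statistics.median(estimates),
        # Assumption: population (divide-by-n) standard deviation.
        "std_dev": statistics.pstdev(estimates),
        "rmse": math.sqrt(statistics.fmean([d * d for d in deviations])),
        # Count of estimates within +/- 10% of the true parameter.
        "within_10pct": sum(abs(d) <= 0.10 * abs(true_value) for d in deviations),
        "max_abs_dev": max(abs(d) for d in deviations),
    }

# Hypothetical estimates of a coefficient whose true value is .89.
measures = summary_measures([0.78, 0.85, 0.89, 0.95, 1.10], true_value=0.89)
```

Since the root mean square error combines the spread and the bias, it is at least as large as the standard deviation computed with the same divisor.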

The consistent estimator of $\sigma_{11}$ provides at the same time what might
be viewed as a "non-predictive" measure of the overall goodness of the
estimates. As a second measure of the overall goodness of the estimates,
I forecasted $y_1$ at (1) the sample mean value of the exogenous variables
and (2) at the sample mean value of the exogenous variables plus one
standard deviation. $y_2$ and $y_3$ were fixed at their theoretical value
for the forecasts. For each experiment I calculated the mean and median
of the forecasted $y_1$. As a measure of the magnitude of the overall bias
of the estimates, I also calculated the norm,
$\|\bar\delta_{.1} - \delta_{.1}\|$, where
$\delta_{.1}' = (.89 \;\; .16 \;\; 44.00 \;\; .74 \;\; .13)$, and
$\bar\delta_{.1}$ is the vector of the average of the estimates derived from
the estimator. A similar norm was calculated for the median.

Finally, for inferential purposes, one normally needs to attach to the
estimate $\alpha^*$ of $\alpha$ a measure of the reliability of the estimate.
For LIML and 2SLS, the measure traditionally used is $\sigma_{\alpha^*}$,
where $\sigma^2_{\alpha^*}$ is a consistent estimate of the variance of
$\alpha^*$ in the asymptotic distribution of $\alpha^*$. The idea is that for
a relatively large sample, the distribution of
$\sqrt{T}\,(\alpha^* - \alpha)/\sigma_{\alpha^*}$ is adequately approximated
by a normal distribution with zero mean and unit variance. I calculated
$\sqrt{T}\,(\alpha^* - \alpha)/\sigma_{\alpha^*}$ for all ILS estimates, and
used the Kolmogorov-Smirnov test to test for significant departure from
normality. Similar results were calculated for 2SLS and LIML for comparative
purposes.
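A one-sample Kolmogorov-Smirnov comparison of standardized estimates with the standard normal distribution can be sketched as follows. It uses only the Python standard library; the function names are hypothetical, the 5% critical value uses the large-sample approximation $1.36/\sqrt{n}$ (an assumption of this sketch, not a value quoted in the paper), and the two samples are artificial.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def ks_statistic_vs_standard_normal(sample):
    """Largest gap between the empirical CDF and the N(0, 1) CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = STD_NORMAL.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def ks_critical_value_5pct(n):
    """Large-sample 5% critical value (approximation assumed here)."""
    return 1.36 / math.sqrt(n)

n = 100  # same order as the 100 replications per experiment
# Artificial sample placed exactly at standard normal quantiles.
ideal = [STD_NORMAL.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
# The same sample shifted by one unit, mimicking a badly centered ratio.
shifted = [x + 1.0 for x in ideal]

d_ideal = ks_statistic_vs_standard_normal(ideal)
d_shifted = ks_statistic_vs_standard_normal(shifted)
critical = ks_critical_value_5pct(n)
```

The quantile-based sample stays well below the critical value, while the shifted sample exceeds it, which is the kind of departure from normality the test is meant to flag.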

For  space  limitations,  I will  not  go  into  the  detail  of  the  results, 
but  give  a summary  of  the  results  in  the  next  section.  (Details  will  be 
made  available  to  interested  readers  upon  request.) 

5.  Summary of the Results

Relative Performance

ILS vs. 2SLS: ILS bias tends to be smaller than 2SLS bias. Estimates
derived from the two procedures do not appear to differ significantly
in concentration or dispersion. The overall goodness of the estimates
favors ILS over 2SLS.

ILS vs. LIML: ILS bias tends to be larger than LIML bias. (But the
norm of the bias shows the performance evenly divided between the two
procedures.) ILS estimates tend to be more concentrated than LIML estimates.
The overall goodness of the estimates favors ILS when a non-predictive
measure is used; the picture is mixed when a predictive measure is used.³

Effect of Sample Size, Size of $\Sigma$ and Sparseness of $\Sigma$ on ILS Bias

The evidence is generally inconsistent with the hypotheses that ILS
bias does not depend on the sample size, the size of $\Sigma$, and the
sparseness of $\Sigma$. There is some indication, however, of an interaction
between size and sparseness. ILS bias does tend to decrease with sparseness
when the size of $\Sigma$ is large, but not significantly so when the size of
$\Sigma$ is small to begin with. More research on the interaction between the
size and sparseness of $\Sigma$ may yield useful results.

Effect of Sample Size, Size of $\Sigma$ and Sparseness of $\Sigma$ on ILS
Relative Performance

The evidence is generally not inconsistent with the hypotheses that
the relative performance of ILS does not depend on the sample size, the
size and sparseness of $\Sigma$.


Tests of the Reliability of ILS Estimates

i)  As the sample size increases, the approximation of the distribution
of $\sqrt{T}\,(\alpha^* - \alpha)/\sigma_{\alpha^*}$ by the asymptotic
distribution gets better. This is true of the ILS, as well as 2SLS and LIML.

ii)  For all cases considered, LIML comes out first in the total as
well as in each sample size, followed by ILS and 2SLS. (A similar result
was noted by Cragg [2, pp. 101-102] for LIML compared with other consistent
estimators (except the Full Information Maximum Likelihood (FIML) estimator).)

iii)  There is evidence that the adequacy of the normal approximation
may depend on the size of $\Sigma$. With T = 20 or 40, the approximation
appears to work better when the size of $\Sigma$ is smaller. (Cragg
[2, pp. 105-106] found a similar tendency for the "t-ratio" of the consistent
estimates he examined.) Generally, the same remarks apply to LIML and 2SLS.

iv)  The adequacy of the normal approximation does not appear to be
influenced by whether or not the structural disturbances are independent.
For ILS (and LIML) this was the case regardless of sample size. For 2SLS,
this was the case with T = 40 and 60.


6.  Concluding Remarks

A referee thoughtfully observed that the attraction of the proposed
method is not so much in its possibly superior small sample behavior, but
rather in the possibility of obtaining estimates of the structural
parameters without recourse to the data once the reduced form is estimated.
The reduced form estimate is all the real world has to give. The
structural constraints are the result of theory or intuition. The method
proposed "keeps these two sources of 'information' nicely apart. It is
one step further on the way to unscrambling the curious mixture of
induction and deduction which is so characteristic of applied econometrics."

As a follow-up, this work will be extended to examine extensively the
sensitivity of the estimates to alternative specification of the structural
constraints and to deal with other aspects of the Monte Carlo experiments
that I have not dealt with at this stage (including the effect of
multicollinearity among the exogenous variables).

A second extension relates to the instrumental variable aspect, which
I only touched on in this paper. The use of the MP inverse opens the way to a
family of I.V. estimators in the overidentified case, which have the same
structure but differ in the definition of the norm. For example, by
premultiplying both sides of (3.15) by $X'X$, we derive the equation of
another I.V. estimator (different from ILS) with $X$ as the matrix of
instruments. (We have also seen that in the overidentified case 2SLS can
be derived by using $X(X'X)^{-1}X'$ instead of the traditional
$X(X'X)^{-1}X'Z_1$ as the matrix of instruments, if we are willing to work
with the MP inverse.) The behavior of several I.V. estimators in the
overidentified case and the question of what constitutes an appropriate norm
will be investigated.
Footnotes

¹Recently Swamy and Holmes [10] and Fisher and Wadycki [4] used g-inverse
to generalize 2SLS, k-class, and 3SLS estimators so that they can be
applied to large econometric models when the sample size is smaller than
the number of predetermined variables. Unfortunately, the procedure
proposed by the authors does not generalize these estimators as claimed.
When $G \ge T$ and rank$(X) = T$, these estimators simply do not exist.

When $X$ has full row rank, $X^-$ will (minimally) satisfy conditions
(i)-(iii) in the text. Its general expression is $X^- = (X'X)^-X'$ (see [8],
Theorem 3.2.2, p. 49). It follows that $XX^- = I$, since $X(X'X)^-X'$ is
invariant for any choice of $(X'X)^-$. Hence, the general solution $\hat\Pi$
for the systematic part of (1.2), namely $\hat\Pi = X^-Y + (I - X^-X)W$,
where $W$ is an arbitrary $G \times m$ matrix, is also the general expression
for the least squares estimates of the reduced form, which will always satisfy
$X\hat\Pi = X(X'X)^-X'Y = Y$ when $X$ has full row rank. In the jargon of
2SLS, there is just no way in which the matrix of endogenous explanatory
variables that appear in the equation of interest can be purged of its
stochastic component, because $\hat Y = X\hat\Pi = Y$. For a similar reason,
the k-class and 3SLS estimators do not exist. The exception occurs when
perfect multicollinearity exists among the predetermined variables such that
rank$(X) < T$. But this is not the general case of large econometric models,
as Fisher and Wadycki [4, p. 463] recognize.

²Some people may question the validity of this measure, unless it is known
that the corresponding moment in the population exists. Recent results by
Mariano [6] show the 2SLS estimates possess the first two moments for the
models we estimated. In light of (2.2) and (3.3), it is reasonable to
infer the same is true of the ILS estimates we derived for the same models.
The results in the literature do not show LIML possesses a first-order
moment. Hence for ILS and 2SLS I used the mean and root mean square error
along with the rest of the measures, but for LIML I confined myself to the
non-parametric measures.

³It is interesting to note the performance when the covariance is $\Sigma_1$.
This is structure 8 taken from Cragg [2] and for which 2SLS and LIML
estimates performed well in Cragg's experiments. ILS does better than 2SLS
by every summary measure. In comparison with LIML, ILS bias is larger
than LIML bias (but the norm of ILS bias is smaller in two out of three
cases). ILS estimates tend to be more concentrated around the parameters
than LIML estimates. To put these results in perspective, I also compared
2SLS and LIML. 2SLS bias is larger than LIML bias; so is the norm of the
bias. Cragg's results showed a slight edge in favor of 2SLS [2, p. 96,
experiment 12]. The two measures of concentration and dispersion do not
agree on the relative performance of 2SLS vs. LIML.